Research Goal

We are studying whether the malaria parasite Plasmodium falciparum is resistant to different drugs, and how different genes are associated with this resistance in Nigeria, both between and within locations. The work below focuses on the drug chloroquine.

Analysis steps

  1. Clean the data
  2. Generate summary tables to show the proportion of drug resistance for different locations
  3. Generate different plots to show drug resistance within locations in Nigeria
  4. Calculate the Identity By State (IBS) matrix using the sample barcodes
  5. Use the IBS matrix to generate different connectivity and cluster plots

Cleaning the data

  1. Keep only the Plasmodium falciparum samples from the dataset
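The filtering step can be sketched as follows. This is a minimal Python illustration, not the code used for this report. The field names `sample_id` and `species`, the label `"Pf"`, and the toy records are all assumptions made for the example.

```python
# Toy records for illustration only; field names and species labels are assumed.
samples = [
    {"sample_id": "S1", "species": "Pf"},
    {"sample_id": "S2", "species": "Pv"},
    {"sample_id": "S3", "species": "Pf"},
    {"sample_id": "S4", "species": None},  # species not recorded
]


def keep_falciparum(records, species_field="species", label="Pf"):
    """Return only the records whose species field equals the given label."""
    return [r for r in records if r.get(species_field) == label]


pf_samples = keep_falciparum(samples)
pf_ids = [r["sample_id"] for r in pf_samples]
```

Records with a missing species value are dropped here along with the other species, which is a design choice that should be checked against the real dataset.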

Proportion tables and plots showing sample counts and drug resistance and sensitivity for chloroquine

Sample Count map

This map shows the sample counts. The circle sizes represent the number of samples from each location, and the tables also give the number of samples for each location in brackets. From the plot we can see that the highest count is for Ibadan, with 115 samples between the years 2017 and 2020, and the lowest is for Edo state, with 2 samples.

## Reading layer `gadm41_NGA_2' from data source 
##   `/Users/sangmariecolley/Desktop/GRC_Analysis/Nigerian samples/gadm41_NGA_2.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 775 features and 0 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 2.668431 ymin: 4.270418 xmax: 14.67642 ymax: 13.89201
## CRS:           NA

Grouped bar chart showing chloroquine drug status by location

Here we have a table and a grouped bar chart showing the count for each drug status category by location.

Chloroquine Resistance map

This map shows the samples that are resistant to the drug chloroquine. Each circle shows the proportion of resistant samples for that location. These proportions are calculated by dividing the number of resistant samples from each location by the total number of samples in the entire dataset.
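The proportion calculation described above can be sketched in Python as follows. This is an illustrative sketch, not the code used for this report. The field names `location` and `status`, the status labels, and the toy records are assumptions.

```python
from collections import Counter

# Toy records for illustration only; locations "A" and "B" are placeholders.
records = [
    {"location": "A", "status": "Resistant"},
    {"location": "A", "status": "Sensitive"},
    {"location": "A", "status": "Resistant"},
    {"location": "B", "status": "Resistant"},
    {"location": "B", "status": "Mixed"},
]


def status_proportions(records, status):
    """Per-location count of `status` divided by the total number of samples."""
    total = len(records)  # denominator: every sample in the dataset
    counts = Counter(r["location"] for r in records if r["status"] == status)
    return {loc: n / total for loc, n in counts.items()}


resistant_props = status_proportions(records, "Resistant")
```

Because the denominator is the whole dataset rather than each location's own sample count, the proportions add up to 1 only across all locations and all status categories together. A location with few samples will therefore always show a small proportion, whatever its local resistance rate is.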

Chloroquine Sensitive map

This map shows the samples that are sensitive to the drug chloroquine. Each circle shows the proportion of sensitive samples for that location. These proportions are calculated by dividing the number of sensitive samples from each location by the total number of samples in the entire dataset.

Chloroquine Mixed Resistance map

This map shows the samples that have mixed resistance to the drug chloroquine.

Chloroquine Mixed Resistance plus Resistant map

This map shows the samples with mixed resistance combined with those that are resistant to the drug chloroquine.

Calculating the Identity by State (IBS) matrix for further analysis

The next stage of the analysis is to calculate an IBS matrix using the gene barcodes for each sample. The IBS matrix quantifies the proportion of alleles that are identical between pairs of individuals at specific genetic loci. From the IBS matrix we can obtain information such as genetic similarity, population structure, relatedness, and genetic diversity. Before calculating the IBS matrix, we kept only loci with less than 30% barcode missingness. This leaves 474 samples and 44 loci for downstream analysis.
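The missingness filter and the pairwise IBS calculation can be sketched in Python as follows. This is a minimal illustration, not the code used for this report. It assumes each barcode is a string with one character per locus and that a missing call is written as "X". The barcodes are toy data, and heterozygous calls get no special handling.

```python
MISSING = "X"  # assumed symbol for a missing call at a locus


def filter_loci(barcodes, max_missing=0.30):
    """Keep loci whose fraction of missing calls is below max_missing."""
    n_samples = len(barcodes)
    keep = [
        j for j in range(len(barcodes[0]))
        if sum(b[j] == MISSING for b in barcodes) / n_samples < max_missing
    ]
    return ["".join(b[j] for j in keep) for b in barcodes], keep


def ibs(a, b):
    """Proportion of identical calls over loci called in both barcodes."""
    compared = [(x, y) for x, y in zip(a, b) if x != MISSING and y != MISSING]
    if not compared:
        return float("nan")  # no locus is called in both samples
    return sum(x == y for x, y in compared) / len(compared)


def ibs_matrix(barcodes):
    """Symmetric matrix of pairwise IBS; the diagonal is set to 1.0 by convention."""
    n = len(barcodes)
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = ibs(barcodes[i], barcodes[j])
    return matrix


# Toy barcodes: the third locus is missing in 3 of 4 samples, so it is dropped.
toy = ["ACGT", "ACXT", "AGXT", "TCXA"]
filtered, kept_loci = filter_loci(toy)
matrix = ibs_matrix(filtered)
```

In this sketch a pair is scored only over loci that are called in both samples, so pairs with many missing calls rest on fewer comparisons. Whether the report handles missing calls in the same way is not stated.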

3D PCoA Plot

The figure shows a 3D principal coordinates analysis (PCoA) plot based on our IBS matrix.

The figures below show comparisons between the different dimensions generated.

IBS Pair Count

Here we have a table showing the number of sample pairs for each unique location pair, based on our 13 locations in Nigeria. We had a total of 836682 pairs from the IBS matrix. Pair_prob in the table below is the count for each location pair divided by the total number of pairs from the IBS matrix.
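The pair count and Pair_prob columns can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used for this report. It counts unordered sample pairs and excludes self-pairs, and the total reported above may have been obtained under a different counting convention. The location labels are toy placeholders.

```python
from collections import Counter
from itertools import combinations


def location_pair_table(locations):
    """Count unordered sample pairs per location pair and their share of all pairs.

    `locations` holds one location label per sample.
    """
    counts = Counter(tuple(sorted(pair)) for pair in combinations(locations, 2))
    total = sum(counts.values())
    table = {pair: {"count": n, "pair_prob": n / total} for pair, n in counts.items()}
    return table, total


# Toy labels: two samples from "A", one from "B", one from "C".
table, total_pairs = location_pair_table(["A", "A", "B", "C"])
```

Sorting each pair makes ("A", "B") and ("B", "A") the same key, so every location pair appears once in the table. Within-location pairs such as ("A", "A") are kept.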

IBS Heatmaps

Below is a heatmap showing a combination of resistant, mixed.resistant, sensitive, and mixed.sensitive pairs from the IBS matrix, with the color gradient spanning from the lowest to the highest IBS score.

Generating connectivity plots

The connectivity plot shows how each state in our dataset relates to the others. The line width indicates how closely related two states are.
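One possible way to derive a line width for each pair of states is the mean IBS over all sample pairs that span the two states. The Python sketch below illustrates that idea only. The report does not state how its line widths are computed, so the choice of the mean is an assumption, and the matrix and labels are toy data.

```python
from collections import defaultdict
from itertools import combinations


def mean_ibs_between_locations(ibs, locations):
    """Mean IBS over sample pairs whose two samples come from different locations."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i, j in combinations(range(len(locations)), 2):
        if locations[i] == locations[j]:
            continue  # within-location pairs do not form an edge
        key = tuple(sorted((locations[i], locations[j])))
        sums[key] += ibs[i][j]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Toy symmetric IBS matrix for three samples: two from "A", one from "B".
toy_ibs = [
    [1.0, 0.8, 0.5],
    [0.8, 1.0, 0.7],
    [0.5, 0.7, 1.0],
]
edge_weights = mean_ibs_between_locations(toy_ibs, ["A", "A", "B"])
```

The resulting dictionary maps each pair of states to one weight, which could then be scaled to a line width in a plotting library.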

Network plots showing the different pairs

Network plot using the Fruchterman-Reingold force-directed algorithm (FR) layout

Network plot using the Circle layout, which places the vertices on a circle

Network plot using the Kamada-Kawai force-directed algorithm (KK) layout

igraph output presented using the edgebundle package

Phylogeny Tree